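Promoting exploratory movements through contingent feedback can positively influence motor development in infancy. Our ongoing work is geared toward developing a robot-assisted contingency learning environment using small aerial robots. This paper examines whether aerial robots and their associated motion controllers can achieve efficient and highly responsive robot flight for this purpose. Infant kicking kinematic data were extracted from videos and used in simulation and physical experiments with an aerial robot. The efficacy of two standard-of-practice controllers was assessed: a linear PID controller and a nonlinear geometric controller. The robot's ability to match infant kicking trajectories was evaluated qualitatively and quantitatively via the mean squared error (to assess overall deviation from the input infant leg trajectory signal) and the dynamic time warping algorithm (to quantify signal synchrony). Results demonstrate that it is in principle possible to track infant kicking trajectories with small aerial robots, and identify areas of further development required to improve tracking quality.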
Translated by Google Translate
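The two trajectory-matching measures named in the first abstract, mean squared error and dynamic time warping (DTW), can be sketched as follows. This is a minimal illustration and not the authors' evaluation code: the function names, the absolute-difference local cost, and the synthetic 1 Hz "kick" signal with a 50 ms delay are all assumptions made for the example.

```python
import math


def mean_squared_error(reference, tracked):
    """Mean squared error between two equal-length 1-D signals."""
    if len(reference) != len(tracked):
        raise ValueError("signals must have the same length")
    return sum((r - x) ** 2 for r, x in zip(reference, tracked)) / len(reference)


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW cost with an absolute-difference local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical signals: a 1 Hz "kick" sampled at 100 Hz and a copy delayed by 50 ms.
times = [k / 100.0 for k in range(200)]
leg = [math.sin(2 * math.pi * t) for t in times]
robot = [math.sin(2 * math.pi * (t - 0.05)) for t in times]

mse = mean_squared_error(leg, robot)
dtw = dtw_distance(leg, robot)
print(f"MSE: {mse:.4f}, DTW cost: {dtw:.4f}")
```

The two measures respond differently to a pure delay: the pointwise error counts it in full at every sample, while DTW can absorb much of it by warping the time axis. Reporting both helps separate shape error from timing error.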
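Action recognition is an important component in improving the autonomy of physical rehabilitation devices, such as wearable robotic exoskeletons. Existing human action recognition algorithms focus on adult applications rather than pediatric ones. In this paper, we introduce BabyNet, a lightweight (in terms of trainable parameters) network structure for recognizing infant reaching actions from off-body stationary cameras. We develop an annotated dataset that includes diverse reaches performed in a sitting posture by different infants in unconstrained environments (e.g., in home settings). Our approach uses the spatial and temporal connection of annotated bounding boxes to interpret the onset and offset of reaching and to detect a complete reaching action. We evaluate the efficiency of our proposed approach and compare its performance against other learning-based network structures in terms of the capability to capture temporal inter-dependencies and the accuracy of detecting reaching onset and offset. Results indicate that BabyNet attains solid performance in terms of (average) testing accuracy, exceeding that of other, larger networks, and can hence serve as a lightweight data-driven framework for video-based infant reaching action recognition.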
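This paper presents the design and assessment of fabric-based soft pneumatic actuators whose low pressurization requirements make them suitable for upper-extremity assistive devices for infants. The goal is to support shoulder abduction and adduction without prohibiting motion in other planes or obstructing elbow joint motion. First, the performance of a family of actuator designs with internal air cells is explored via simulation. The actuators are parameterized by the number of cells and their width. Physically viable actuator variants identified through simulation are then tested in hardware experiments. Two designs are selected and tested on a custom-built physical model based on an infant's body anthropometrics. Comparisons of the force exerted to lift the arm, movement smoothness, path length, and maximum shoulder angle reached indicate which design is better suited for use as an actuator in pediatric wearable assistive devices, and offer further insights for future work.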
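This work focuses on closed-loop control, based on proprioceptive feedback, of a pneumatically actuated soft wearable device intended to support infant reaching tasks in the future. The device comprises two soft pneumatic actuators (one textile-based and one silicone-cast) that actively control two degrees of freedom per arm (shoulder adduction/abduction and elbow flexion/extension, respectively). Inertial measurement units (IMUs) attached to the wearable device provide real-time joint angle feedback. The device kinematics analysis is informed by anthropometric data for infants (arm lengths) reported in the literature. Movement and muscle co-activation patterns in infant reaching are considered in deriving the desired trajectories for the device's end-effector. A proportional-derivative controller is then developed to regulate the pressure inside the actuators and, in turn, move the arm along desired setpoints within the reachable workspace. Experimental results on tracking desired arm trajectories with an engineered mannequin are presented, demonstrating that the proposed controller can help guide the mannequin's wrist to the desired setpoints.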
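The proportional-derivative control loop described in the soft wearable abstract above can be illustrated with a minimal discrete-time sketch. Everything specific here is an assumption for illustration: the class name, the gains, the units (joint angle in degrees, pressure command in kPa), and the output limits are hypothetical and are not taken from the paper.

```python
class PDController:
    """Discrete PD controller mapping joint-angle error to a clamped pressure command."""

    def __init__(self, kp, kd, out_min, out_max):
        self.kp = kp            # proportional gain (hypothetical units: kPa per degree)
        self.kd = kd            # derivative gain (hypothetical units: kPa*s per degree)
        self.out_min = out_min  # lowest allowed pressure command
        self.out_max = out_max  # highest allowed pressure command
        self._prev_error = None

    def update(self, setpoint, measurement, dt):
        """Return the pressure command for one control step of duration dt seconds."""
        error = setpoint - measurement
        if self._prev_error is None:
            derivative = 0.0  # no history yet on the first step
        else:
            derivative = (error - self._prev_error) / dt
        self._prev_error = error
        command = self.kp * error + self.kd * derivative
        # Clamp so the actuator is never asked for an out-of-range pressure.
        return max(self.out_min, min(self.out_max, command))


# Hypothetical usage: drive a shoulder angle from 20 degrees toward a 30 degree setpoint.
controller = PDController(kp=2.0, kd=0.5, out_min=0.0, out_max=50.0)
first = controller.update(setpoint=30.0, measurement=20.0, dt=0.1)
second = controller.update(setpoint=30.0, measurement=22.0, dt=0.1)
print(first, second)
```

In a real device the measurement would come from the IMU-derived joint angle at each step. The derivative term reduces the command while the error is shrinking, which damps overshoot, and the clamp keeps the command within the actuator's pressure range.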
Objective: Accurate visual classification of bladder tissue during Trans-Urethral Resection of Bladder Tumor (TURBT) procedures is essential to improve early cancer diagnosis and treatment. During TURBT interventions, White Light Imaging (WLI) and Narrow Band Imaging (NBI) techniques are used for lesion detection. Each imaging technique provides diverse visual information that allows clinicians to identify and classify cancerous lesions. Computer vision methods that use both imaging techniques could improve endoscopic diagnosis. We address the challenge of tissue classification when annotations are available in only one domain, in our case WLI, and the endoscopic images form an unpaired dataset, i.e., there is no exact equivalent for every image in both the NBI and WLI domains. Method: We propose a semi-supervised Generative Adversarial Network (GAN)-based method composed of three main components: a teacher network trained on the labeled WLI data; a cycle-consistency GAN that performs unpaired image-to-image translation; and a multi-input student network. To ensure the quality of the synthetic images generated by the proposed GAN, we perform a detailed quantitative and qualitative analysis with the help of specialists. Conclusion: The overall average classification accuracy, precision, and recall obtained with the proposed method for tissue classification are 0.90, 0.88, and 0.89, respectively, while the same metrics obtained in the unlabeled domain (NBI) are 0.92, 0.64, and 0.94, respectively. The quality of the generated images is reliable enough to deceive specialists. Significance: This study shows the potential of using semi-supervised GAN-based classification to improve bladder tissue classification when annotations are limited in multi-domain data.
The receptive field (RF), which determines the region of a time series to be "seen" and used, is critical to improving performance in time series classification (TSC). However, the variation of signal scales across and within time series data makes it challenging to decide on proper RF sizes for TSC. In this paper, we propose a dynamic sparse network (DSN) with sparse connections for TSC, which can learn to cover various RFs without cumbersome hyperparameter tuning. The kernels in each sparse layer are sparse and can be explored within constrained regions by dynamic sparse training, which makes it possible to reduce the resource cost. The experimental results show that the proposed DSN model achieves state-of-the-art performance on both univariate and multivariate TSC datasets with less than 50% of the computational cost of recent baseline methods, opening the path towards more accurate resource-aware methods for time series analysis. Our code is publicly available at: https://github.com/QiaoXiao7282/DSN.
While the problem of hallucinations in neural machine translation has long been recognized, little progress has been made so far on alleviating it. Indeed, it recently turned out that without artificially encouraging models to hallucinate, previously existing methods fall short and even the standard sequence log-probability is more informative. This means that characteristics internal to the model can give much more information than we expect, and before using external models and measures, we first need to ask: how far can we go if we use nothing but the translation model itself? We propose to use a method that evaluates the percentage of the source contribution to a generated translation. Intuitively, hallucinations are translations "detached" from the source, hence they can be identified by low source contribution. This method improves detection accuracy for the most severe hallucinations by a factor of 2 and is able to alleviate hallucinations at test time on par with the previous best approach that relies on external models. Next, if we move away from internal model characteristics and allow external tools, we show that using sentence similarity from cross-lingual embeddings further improves these results.
We pose video object segmentation as spectral graph clustering in space and time, with one graph node for each pixel and edges forming local space-time neighborhoods. We claim that the strongest cluster in this video graph represents the salient object. We start by introducing a novel and efficient method based on 3D filtering for approximating the spectral solution, as the principal eigenvector of the graph's adjacency matrix, without explicitly building the matrix. This key property allows us to have a fast parallel implementation on GPU, orders of magnitude faster than classical approaches for computing the eigenvector. Our motivation for a spectral space-time clustering approach, unique in video semantic segmentation literature, is that such clustering is dedicated to preserving object consistency over time, which we evaluate using our novel segmentation consistency measure. Further on, we show how to efficiently learn the solution over multiple input feature channels. Finally, we extend the formulation of our approach beyond the segmentation task, into the realm of object tracking. In extensive experiments we show significant improvements over top methods, as well as over powerful ensembles that combine them, achieving state-of-the-art on multiple benchmarks, both for tracking and segmentation.
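The idea in the video segmentation abstract above, approximating the principal eigenvector of the graph's adjacency matrix by filtering instead of building the matrix, can be illustrated with power iteration. The sketch below is not the authors' formulation. It assumes a simplified adjacency M = diag(s) B diag(s), where s is a hypothetical per-pixel saliency and B links each pixel to its 3x3x3 space-time neighborhood, so each matrix-vector product reduces to one 3D box filter. The function names and the toy video are also illustrative.

```python
import math
from itertools import product

OFFSETS = list(product((-1, 0, 1), repeat=3))  # 3x3x3 space-time neighborhood


def box_filter_3d(values, shape):
    """Sum each voxel's 3x3x3 neighborhood (zero padding) on a flattened T x H x W grid."""
    frames, height, width = shape
    out = [0.0] * len(values)
    for t, i, j in product(range(frames), range(height), range(width)):
        total = 0.0
        for dt, di, dj in OFFSETS:
            tt, ii, jj = t + dt, i + di, j + dj
            if 0 <= tt < frames and 0 <= ii < height and 0 <= jj < width:
                total += values[(tt * height + ii) * width + jj]
        out[(t * height + i) * width + j] = total
    return out


def apply_adjacency(saliency, x, shape):
    """Compute M @ x for M = diag(s) B diag(s) without ever forming M."""
    filtered = box_filter_3d([s * v for s, v in zip(saliency, x)], shape)
    return [s * v for s, v in zip(saliency, filtered)]


def principal_eigenvector(saliency, shape, iterations=50):
    """Power iteration: repeatedly apply M by filtering, then renormalize."""
    n = len(saliency)
    x = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        y = apply_adjacency(saliency, x, shape)
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return x


# Hypothetical toy video: 3 frames of 4x4 pixels with a salient 2x2 patch in every frame.
shape = (3, 4, 4)
frames, height, width = shape
saliency = [
    1.0 if 1 <= i <= 2 and 1 <= j <= 2 else 0.1
    for t, i, j in product(range(frames), range(height), range(width))
]
eigvec = principal_eigenvector(saliency, shape)
inside = [v for v, s in zip(eigvec, saliency) if s == 1.0]
outside = [v for v, s in zip(eigvec, saliency) if s != 1.0]
print(f"min inside: {min(inside):.3f}, max outside: {max(outside):.3f}")
```

Because each iteration is only element-wise products and a local filter, memory stays linear in the number of pixels and the filter maps naturally onto parallel hardware. Thresholding the resulting eigenvector would give a soft segmentation of the dominant cluster under the assumptions stated above.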
Metric Elicitation (ME) is a framework for eliciting classification metrics that better align with implicit user preferences based on the task and context. The existing ME strategy is based on the assumption that users can most easily provide preference feedback over classifier statistics such as confusion matrices. This work examines ME by providing a first-ever implementation of the ME strategy. Specifically, we create a web-based ME interface and conduct a user study that elicits users' preferred metrics in a binary classification setting. We discuss the study findings and present guidelines for future research in this direction.
Learning-based image compression has improved to a level where it can outperform traditional image codecs such as HEVC and VVC in terms of coding performance. In addition to good compression performance, device interoperability is essential for a compression codec to be deployed, i.e., encoding and decoding on different CPUs or GPUs should be error-free and incur negligible performance reduction. In this paper, we present a method to solve the device interoperability problem of a state-of-the-art image compression network. We apply quantization to the entropy networks, which output the entropy parameters. We suggest a simple method that ensures cross-platform encoding and decoding and can be implemented quickly, with a minor performance deviation of 0.3% BD-rate from the floating-point model results.